Results 1 - 13 of 13
1.
medrxiv; 2023.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2023.09.11.23295259

ABSTRACT

Post-Acute Sequelae of SARS-CoV-2 infection (PASC), also known as Long-COVID, encompasses a variety of complex and varied outcomes following COVID-19 infection that are still poorly understood. We clustered over 600 million condition diagnoses from 14 million patients available through the National COVID Cohort Collaborative (N3C), generating hundreds of highly detailed clinical phenotypes. Assessing patient clinical trajectories using these clusters allowed us to identify individual conditions and phenotypes strongly increased after acute infection. We found many conditions increased in COVID-19 patients compared to controls, and using a novel method to predict patient/cluster assignment over time, we additionally found phenotypes specific to patient sex, age, wave of infection, and PASC diagnosis status. While many of these results reflect known PASC symptoms, the resolution provided by this unprecedented data scale suggests avenues for improved diagnostics and mechanistic understanding of this multifaceted disease.


Subject(s)
COVID-19 , Acute Disease
2.
arxiv; 2023.
Preprint in English | PREPRINT-ARXIV | ID: ppzbmed-2302.10800v1

ABSTRACT

Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of knowledge graphs is lacking. Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of knowledge graphs. Features include a simple, modular extract-transform-load (ETL) pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate knowledge graphs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph machine learning, including node embeddings and training of models for link prediction and node classification.


Subject(s)
COVID-19
3.
arxiv; 2022.
Preprint in English | PREPRINT-ARXIV | ID: ppzbmed-2210.02490v1

ABSTRACT

Post-acute sequelae of SARS-CoV-2 infection (PASC) or Long COVID is an emerging medical condition that has been observed in several patients with a positive diagnosis for COVID-19. Historical Electronic Health Record (EHR) data such as diagnosis codes, lab results and clinical notes have been analyzed using deep learning and have been used to predict future clinical events. In this paper, we propose an interpretable deep learning approach to analyze historical diagnosis code data from the National COVID Cohort Collaborative (N3C) to find the risk factors contributing to developing Long COVID. Using our deep learning approach, we are able to predict if a patient is suffering from Long COVID from a temporally ordered list of diagnosis codes up to 45 days post the first COVID positive test or diagnosis for each patient, with an accuracy of 70.48%. We are then able to examine the trained model using Gradient-weighted Class Activation Mapping (GradCAM) to give each input diagnosis a score. The highest-scoring diagnoses were deemed to be the most important for making the correct prediction for a patient. We also propose a way to summarize these top diagnoses for each patient in our cohort and look at their temporal trends to determine which codes contribute towards a positive Long COVID diagnosis.


Subject(s)
COVID-19
4.
medrxiv; 2022.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2022.08.15.22278603

ABSTRACT

Background: More than one-third of individuals experience post-acute sequelae of SARS-CoV-2 infection (PASC, which includes long-COVID). Objective: To identify risk factors associated with PASC/long-COVID. Design: Retrospective case-control study. Setting: 31 health systems in the United States from the National COVID Cohort Collaborative (N3C). Patients: 8,325 individuals with PASC (defined by the presence of the International Classification of Diseases, version 10 code U09.9 or a long-COVID clinic visit) matched to 41,625 controls within the same health system. Measurements: Risk factors included demographics, comorbidities, and treatment and acute characteristics related to COVID-19. Multivariable logistic regression, random forest, and XGBoost were used to determine the associations between risk factors and PASC. Results: Among 8,325 individuals with PASC, the majority were >50 years of age (56.6%), female (62.8%), and non-Hispanic White (68.6%). In logistic regression, middle-age categories (40 to 69 years; OR ranging from 2.32 to 2.58), female sex (OR 1.4, 95% CI 1.33-1.48), hospitalization associated with COVID-19 (OR 3.8, 95% CI 3.05-4.73), long (8-30 days, OR 1.69, 95% CI 1.31-2.17) or extended hospital stay (30+ days, OR 3.38, 95% CI 2.45-4.67), receipt of mechanical ventilation (OR 1.44, 95% CI 1.18-1.74), and several comorbidities including depression (OR 1.50, 95% CI 1.40-1.60), chronic lung disease (OR 1.63, 95% CI 1.53-1.74), and obesity (OR 1.23, 95% CI 1.16-1.3) were associated with increased likelihood of PASC diagnosis or care at a long-COVID clinic. Characteristics associated with a lower likelihood of PASC diagnosis or care at a long-COVID clinic included younger age (18 to 29 years), male sex, non-Hispanic Black race, and comorbidities such as substance abuse, cardiomyopathy, psychosis, and dementia. More doctors per capita in the county of residence was associated with an increased likelihood of PASC diagnosis or care at a long-COVID clinic. Our findings were consistent in sensitivity analyses using a variety of analytic techniques and approaches to select controls. Conclusions: This national study identified important risk factors for PASC such as middle age, severe COVID-19 disease, and specific comorbidities. Further clinical and epidemiological research is needed to better understand underlying mechanisms and the potential role of vaccines and therapeutics in altering PASC course. Key Points. Question: What risk factors are associated with post-acute sequelae of SARS-CoV-2 (PASC) in the National COVID Cohort Collaborative (N3C) EHR Cohort? Findings: This national study identified important risk factors for PASC such as middle age, severe COVID-19 disease, specific comorbidities, and the number of physicians per capita. Meaning: Clinicians can use these risk factors to identify patients at high risk for PASC while they are still in the acute phase of their infection and also to support targeted enrollment in clinical trials for preventing or treating PASC.
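
Note on reading the odds ratios above: in logistic regression, an odds ratio is conventionally the exponential of a fitted coefficient, and a 95% Wald interval is obtained by exponentiating the coefficient plus or minus 1.96 standard errors. The sketch below is illustrative only and is not taken from the preprint; it assumes Wald intervals (the abstract does not say how the intervals were computed), and the function names are hypothetical. It backs out the log-scale standard error implied by the reported female-sex interval (1.33-1.48) and rebuilds the odds ratio from it.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic-regression coefficient.

    Assumes a Wald interval: exp(beta +/- z * se).
    """
    return (math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se))


def se_from_ci(lower, upper, z=1.96):
    """Back out the log-scale standard error implied by a reported interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Reported in the abstract: female sex, OR 1.4 (95% CI 1.33-1.48).
se = se_from_ci(1.33, 1.48)
# Under the Wald assumption the coefficient is the log-scale midpoint.
beta = (math.log(1.33) + math.log(1.48)) / 2
odds_ratio, lower, upper = odds_ratio_with_ci(beta, se)
print(round(odds_ratio, 2), round(lower, 2), round(upper, 2))
```

The reconstructed point estimate lands close to the reported 1.4, which is consistent with (but does not prove) a symmetric interval on the log-odds scale.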


Subject(s)
Dementia , Substance-Related Disorders , Pulmonary Disease, Chronic Obstructive , Depressive Disorder , Psychoses, Substance-Induced , Obesity , COVID-19 , Cardiomyopathies
5.
medrxiv; 2022.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2022.08.04.22278439

ABSTRACT

Objective: To define pregnancy episodes and estimate gestational aging within electronic health record (EHR) data from the National COVID Cohort Collaborative (N3C). Materials and Methods: We developed a comprehensive approach, named Hierarchy and rule-based pregnancy episode Inference integrated with Pregnancy Progression Signatures (HIPPS) and applied it to EHR data in the N3C from 1 January 2018 to 7 April 2022. HIPPS combines: 1) an extension of a previously published pregnancy episode algorithm, 2) a novel algorithm to detect gestational aging-specific signatures of a progressing pregnancy for further episode support, and 3) pregnancy start date inference. Clinicians performed validation of HIPPS on a subset of episodes. We then generated three types of pregnancy cohorts based on the level of precision for gestational aging and pregnancy outcomes for comparison of COVID-19 and other characteristics. Results: We identified 628,165 pregnant persons with 816,471 pregnancy episodes, of which 52.3% were live births, 24.4% were other outcomes (stillbirth, ectopic pregnancy, spontaneous abortions), and 23.3% had unknown outcomes. We were able to estimate start dates within one week of precision for 431,173 (52.8%) episodes. 66,019 (8.1%) episodes had incident COVID-19 during pregnancy. Across varying COVID-19 cohorts, patient characteristics were generally similar though pregnancy outcomes differed. Discussion: HIPPS provides support for pregnancy-related variables based on EHR data for researchers to define pregnancy cohorts. Our approach performed well based on clinician validation. Conclusion: We have developed a novel and robust approach for inferring pregnancy episodes and gestational aging that addresses data inconsistency and missingness in EHR data.


Subject(s)
COVID-19 , Stillbirth
6.
medrxiv; 2022.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2022.07.08.22277388

ABSTRACT

Acute COVID-19 infection can be followed by persistent or newly diagnosed manifestations in many different organ systems, referred to as Post Acute Sequelae of SARS-CoV-2 Infection (PASC). Numerous studies have shown an increased risk of being diagnosed with new-onset psychiatric disease in the first 21-120 days following a diagnosis of acute COVID-19. However, it was unclear whether non-psychiatric PASC-associated manifestations (PASC-AMs) are associated with an increased risk of receiving a diagnosis of new-onset psychiatric disease following COVID-19. Here, we perform a retrospective electronic health record (EHR) cohort study to evaluate whether non-psychiatric PASC-AMs can predict whether patients will receive a diagnosis of new-onset psychiatric disease. Data were obtained from the National COVID Cohort Collaborative (N3C), which has EHR data from 65 clinical organizations which are harmonized using the Observational Medical Outcomes Partnership (OMOP) data model. Non-psychiatric PASC-AMs were recorded 21-120 days following SARS-CoV-2 diagnosis and before diagnosis of new-onset psychiatric disease. OMOP codes were mapped to 178 Human Phenotype Ontology (HPO) terms that represent PASC-AMs. Logistic regression was applied to predict newly diagnosed psychiatric disease occurrence based on age, sex, race, pre-existing comorbidities, and PASC-AMs in eleven categories. The cohort of 1,135,973 individuals with acute COVID-19 had a mean age of 40.5 years and included 56.0% females. We found a significant association for seven of the HPO categories with newly diagnosed psychiatric disease, with odds ratios highest for neurological (2.30, 2.24-2.36) and cardiovascular (1.77, 1.69-1.85) PASC-AMs. Secondary analysis revealed that the proportions of 95 of 154 individual phenotypic features differed significantly among patients diagnosed with different psychiatric diseases (anxiety, mood disorders, dementia, and psychosis). Neurological, pulmonary, gastrointestinal, endocrine, cardiovascular, constitutional, and ENT PASC-AMs are each associated with an increased risk of newly diagnosed psychiatric disease. This suggests that the total burden of PASC-AMs influences the risk of receiving a diagnosis of a new-onset psychiatric disease. This finding may be used to inform psychiatric screening following acute COVID-19 by identifying high-risk patients.


Subject(s)
Anxiety Disorders , Dementia , Mood Disorders , Mental Disorders , Severe Acute Respiratory Syndrome , Psychoses, Substance-Induced , COVID-19
7.
medrxiv; 2022.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2022.05.24.22275398

ABSTRACT

Accurate stratification of patients with Post-acute sequelae of SARS-CoV-2 infection (PASC, or long COVID) would allow precision clinical management strategies and could enable more focussed investigation of the molecular pathogenetic mechanisms of this disease. However, the natural history of long COVID is incompletely understood and characterized by an extremely wide range of manifestations that are difficult to analyze computationally. In addition, the generalizability of machine learning classification of COVID-19 clinical outcomes has rarely been tested. We present a method for computationally modeling long COVID phenotype data based on electronic healthcare records (EHRs) and for assessing pairwise phenotypic similarity between patients using semantic similarity. Using unsupervised machine learning (k-means clustering), we found six distinct clusters of long COVID patients, each with distinct profiles of phenotypic abnormalities with enrichments in pulmonary, cardiovascular, neuropsychiatric, and constitutional symptoms such as fatigue and fever. There was a highly significant association of cluster membership with a range of pre-existing conditions and with measures of severity during acute COVID-19. We show that the clusters we identified in one hospital system were generalizable across different hospital systems. Semantic phenotypic clustering can provide a foundation for assigning patients to stratified subgroups for natural history or therapy studies on long COVID.
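
Note on the clustering step: the abstract names k-means but gives no implementation details, so the following is only a minimal sketch of the generic algorithm (Lloyd's iteration with squared Euclidean distance), not the authors' pipeline. The preprint clusters patients by semantic phenotypic similarity; here the inputs are hypothetical numeric feature vectors, and the function name `kmeans` and its parameters are assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means: alternate assignment and centroid update until stable."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points (an assumption;
    # other initialisation schemes exist).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments unchanged, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: two well-separated groups of "patients".
patients = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(patients, k=2)
print(labels)
```

In practice the number of clusters (six in the abstract) is a modelling choice, and which integer label each group receives depends on the random initialisation.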


Subject(s)
COVID-19 , Fever , Fatigue , Disease
8.
medrxiv; 2022.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2022.01.27.22269865

ABSTRACT

Background: Reports of SARS-CoV-2 causing laryngotracheobronchitis (commonly known as croup) have been limited to small case series. Early reports suggest the Omicron (B.1.1.529) strain of SARS-CoV-2 (the dominant circulating US strain since the week of 12/25/2021) replicates more efficiently in the conducting airways. This may increase the risk of a croup phenotype in children as they have smaller airway calibers. Methods: Description of the incidence, change over time, and characteristics of children with SARS-CoV-2 and upper airway infection (UAI) diagnoses within the National COVID Cohort Collaborative (N3C) before and during the rise of the Omicron variant. We compare the demographics, comorbidities, and clinical outcomes of hospitalized SARS-CoV-2 positive children with and without UAI. Results: SARS-CoV-2 positive UAI cases increased to the highest number per month (N = 170) in December 2021 as the Omicron variant became dominant. Of 15,806 hospitalized children with SARS-CoV-2, 1.5% (234/15,806) had a UAI diagnosis. Those with UAI were more likely to be male, younger, and white, to have asthma, and to develop severe disease than those without UAI. Conclusions: Pediatric acute UAI cases have increased during the Omicron variant surge, with many developing severe disease. Improved understanding of this emerging clinical phenotype could aid in therapeutic decision-making and healthcare resource planning.


Subject(s)
Airway Obstruction , Asthma
9.
medrxiv; 2021.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2021.11.30.21267071

ABSTRACT

Background: COVID-19 has been shown to increase the risk of adverse mental health consequences. A recent electronic health record (EHR)-based observational study showed an almost two-fold increased risk of new-onset mental illness in the first 90 days following a diagnosis of acute COVID-19. Methods: We used the National COVID Cohort Collaborative, a harmonized EHR repository with 2,965,506 COVID-19 positive patients, and compared cohorts of COVID-19 patients with comparable controls. Patients were propensity score-matched to control for confounding factors. We estimated the hazard ratio (COVID-19:control) for new-onset of mental illness for the first year following diagnosis. We additionally estimated the change in risk for new-onset mental illness between the periods of 21-120 and 121-365 days following infection. Findings: We find a significant increase in incidence of new-onset mental disorders in the period of 21-120 days following COVID-19 (3.8%, 3.6-4.0) compared to patients with respiratory tract infections (3%, 2.8-3.2). We further show that the risk for new-onset mental illness decreases over the first year following COVID-19 diagnosis compared to other respiratory tract infections and demonstrate a reduced (non-significant) hazard ratio over the period of 121-365 days following diagnosis. Similar findings are seen for new-onset anxiety disorders but not for mood disorders. Interpretation: Patients who have recovered from COVID-19 are at an increased risk for developing new-onset mental illness, especially anxiety disorders. This risk is most prominent in the first 120 days following infection. Funding: National Center for Advancing Translational Sciences (NCATS).


Subject(s)
Anxiety Disorders , Mood Disorders , Mental Disorders , Respiratory Tract Infections , Intellectual Disability , COVID-19
10.
arxiv; 2021.
Preprint in English | PREPRINT-ARXIV | ID: ppzbmed-2110.10780v3

ABSTRACT

While we pay attention to the latest advances in clinical natural language processing (NLP), we can notice some resistance in the clinical and translational research community to adopt NLP models due to limited transparency, interpretability, and usability. In this study, we proposed an open natural language processing development framework. We evaluated it through the implementation of NLP algorithms for the National COVID Cohort Collaborative (N3C). Based on the interests in information extraction from COVID-19 related clinical notes, our work includes 1) an open data annotation process using COVID-19 signs and symptoms as the use case, 2) a community-driven ruleset composing platform, and 3) a synthetic text data generation workflow to generate texts for information extraction tasks without involving human subjects. The corpora were derived from texts from three different institutions (Mayo Clinic, University of Kentucky, University of Minnesota). The gold standard annotations were tested with a single institution's (Mayo) ruleset. This resulted in performances of 0.876, 0.706, and 0.694 in F-scores for Mayo, Minnesota, and Kentucky test datasets, respectively. The study as a consortium effort of the N3C NLP subgroup demonstrates the feasibility of creating a federated NLP algorithm development and benchmarking platform to enhance multi-institution clinical NLP study and adoption. Although we use COVID-19 as a use case in this effort, our framework is general enough to be applied to other domains of interest in clinical NLP.
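
Note on the reported F-scores: the abstract does not state which F-measure was used. Assuming the common balanced F1 (the harmonic mean of precision and recall), the metric can be sketched as below. The counts in the example are hypothetical and are not taken from the study; the function name is also an assumption.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Balanced F-score (F1): harmonic mean of precision and recall.

    Returns 0.0 when precision and recall are both undefined or zero.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotation counts for one test set (not from the preprint).
score = f1_score(true_positives=80, false_positives=10, false_negatives=20)
print(round(score, 3))
```

Because F1 penalises both missed and spurious extractions, a ruleset tuned on one institution's notes can score noticeably lower on another's, which matches the pattern of the three figures reported in the abstract.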


Subject(s)
COVID-19
11.
medrxiv; 2021.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2021.07.19.21260767

ABSTRACT

Importance: SARS-CoV-2 Objective: To determine the characteristics, changes over time, outcomes, and severity risk factors of SARS-CoV-2 affected children within the National COVID Cohort Collaborative (N3C). Design: Prospective cohort study of encounters with end dates before May 27th, 2021. Setting: 45 N3C institutions. Participants: Children < 19 years old at initial SARS-CoV-2 testing. Main Outcomes and Measures: Case incidence and severity over time, demographic and comorbidity severity risk factors, vital sign and laboratory trajectories, clinical outcomes, and acute COVID-19 vs MIS-C contrasts for children infected with SARS-CoV-2. Results: 728,047 children in the N3C were tested for SARS-CoV-2; of these, 91,865 (12.6%) were positive. Among the 5,213 (6%) hospitalized children, 685 (13%) met criteria for severe disease: mechanical ventilation (7%), vasopressor/inotropic support (7%), ECMO (0.6%), or death/discharge to hospice (1.1%). Male gender, African American race, older age, and several pediatric complex chronic condition (PCCC) subcategories were associated with higher clinical severity (p ≤ 0.05). Vital signs (all p ≤ 0.002) and many laboratory tests from the first day of hospitalization were predictive of peak disease severity. Children with severe (vs moderate) disease were more likely to receive antimicrobials (71% vs 32%, p < 0.001) and immunomodulatory medications (53% vs 16%, p < 0.001). Compared to those with acute COVID-19, children with MIS-C were more likely to be male, Black/African American, 1-to-12-years-old, and less likely to have asthma, diabetes, or a PCCC (p < 0.04). MIS-C cases demonstrated a more inflammatory laboratory profile and more severe clinical phenotype with higher rates of invasive ventilation (12% vs 6%) and need for vasoactive-inotropic support (31% vs 6%) compared to acute COVID-19 cases, respectively (p < 0.03). Conclusions: In the largest U.S. SARS-CoV-2-positive pediatric cohort to date, we observed differences in demographics, pre-existing comorbidities, and initial vital sign and laboratory test values between severity subgroups. Taken together, these results suggest that early identification of children likely to progress to severe disease could be achieved using readily available data elements from the day of admission. Further work is needed to translate this knowledge into improved outcomes.


Subject(s)
COVID-19 , Diabetes Mellitus , Asthma , Death
12.
medrxiv; 2021.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2021.06.23.21259416

ABSTRACT

Importance: Since late 2019, the novel coronavirus SARS-CoV-2 has given rise to a global pandemic and introduced many health challenges with economic, social, and political consequences. In addition to a complex acute presentation that can affect multiple organ systems, there is mounting evidence of various persistent long-term sequelae. The worldwide scientific community is characterizing a diverse range of seemingly common long-term outcomes associated with SARS-CoV-2 infection, but the underlying assumptions in these studies vary widely, making comparisons difficult. Numerous publications describe the clinical manifestations of post-acute sequelae of SARS-CoV-2 infection (PASC or long COVID), but they are difficult to integrate because of heterogeneous methods and the lack of a standard for denoting the many phenotypic manifestations of long COVID. Observations: We identified 303 articles published before April 29, 2021, curated 59 relevant manuscripts that described clinical manifestations in 81 cohorts of individuals three weeks or more following acute COVID-19, and mapped 287 unique clinical findings to Human Phenotype Ontology (HPO) terms. Conclusions and Relevance: Patients and clinicians often use different terms to describe the same symptom or condition. Addressing the heterogeneous and inconsistent language used to describe the clinical manifestations of long COVID, combined with the lack of standardized terminologies for long COVID, will provide a necessary foundation for comparison and meta-analysis of different studies. Translating long COVID manifestations into computable HPO terms will improve the analysis, data capture, and classification of long COVID patients. If researchers, clinicians, and patients share a common language, then studies can be compared or pooled more effectively. Furthermore, mapping lay terminology to HPO for long COVID manifestations will help patients assist clinicians and researchers in creating phenotypic characterizations that are computationally accessible, which may improve the stratification and thereby diagnosis and treatment of long COVID.


Subject(s)
COVID-19
13.
medrxiv; 2021.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2021.03.20.21253896

ABSTRACT

Since late 2019, the novel coronavirus SARS-CoV-2 has introduced a wide array of health challenges globally. In addition to a complex acute presentation that can affect multiple organ systems, increasing evidence points to long-term sequelae being common and impactful. As the worldwide scientific community forges ahead with efforts to characterize a wide range of outcomes associated with SARS-CoV-2 infection, the proliferation of available data has made it clear that formal definitions are needed in order to design robust and consistent studies of Long COVID that consistently capture variation in long-term outcomes. In the present study, we investigate the definitions used in the literature published to date and compare them against data available from electronic health records and patient-reported information collected via surveys. Long COVID holds the potential to produce a second public health crisis on the heels of the pandemic. Proactive efforts to identify the characteristics of this heterogeneous condition are imperative for a rigorous scientific effort to investigate and mitigate this threat.


Subject(s)
COVID-19